fix(client): treat HTTP 404 with session ID as session expiry#2125
fix(client): treat HTTP 404 with session ID as session expiry#2125dsp-ant wants to merge 1 commit into
Conversation
Per the MCP spec (Streamable HTTP, Session Management), when a client receives an HTTP 404 in response to a request that carried an Mcp-Session-Id, the session has expired or been terminated server-side and the client must start a new session. StreamableHTTPClientTransport previously surfaced every non-401/403 error status — including 404 — as a generic ClientHttpNotImplemented (POST) or ClientHttpFailedToOpenStream (GET) error, with no way to distinguish session expiry from other failures. Consumers were left matching the response body, which only works against the reference server; servers that report expiry with a different body (e.g. a -32002 JSON-RPC code, or a plain-text/HTML proxy response) slipped through. Detect session expiry by status code alone, scoped to requests that actually carried a session ID: on a 404 when a session ID is set, clear the stale session ID (so a subsequent connect() issues a fresh initialize) and throw SdkError with the new SdkErrorCode.ClientHttpSessionExpired. A 404 without a session ID (e.g. wrong URL on initial connect) is unchanged and still surfaces as ClientHttpNotImplemented. Applied to both the POST (_send) and standalone GET SSE (_startOrAuthSse) paths; documents the new code in the migration guides.
|
@modelcontextprotocol/client
@modelcontextprotocol/server
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
| // Per the MCP spec (Streamable HTTP, Session Management): a 404 in | ||
| // response to a request that carried an `Mcp-Session-Id` means the | ||
| // session has expired or been terminated server-side, and the client | ||
| // must start a new session. Detect this by the status code alone — | ||
| // not the response body — since non-reference servers report it with | ||
| // varying bodies (different JSON-RPC error codes, plain text, HTML). | ||
| // Clear the dead session ID so a subsequent reconnect issues a fresh | ||
| // `initialize`, and surface a distinct, body-agnostic error code. | ||
| if (response.status === 404 && this._sessionId !== undefined) { | ||
| this._sessionId = undefined; | ||
| throw new SdkError(SdkErrorCode.ClientHttpSessionExpired, `Session expired (HTTP 404): ${text}`, { | ||
| status: 404, | ||
| text | ||
| }); | ||
| } | ||
|
|
There was a problem hiding this comment.
🟡 The 404 session-expiry check reads this._sessionId after the response's mcp-session-id header has already been written into it a few lines above, so it tests post-response state rather than whether the request carried a session ID (which is what the spec rule and the inline comment describe). Snapshot const hadSessionId = this._sessionId !== undefined before the await fetch (or check the request headers) and branch on that. The same hunk in _startOrAuthSse reads consistently because the GET path never writes the response header back, but using the same snapshot pattern there keeps the two branches symmetric.
Extended reasoning...
What the check actually tests
The new 404 branch in _send is intended to implement the MCP spec rule: "When a client receives HTTP 404 in response to a request containing an Mcp-Session-Id, it MUST start a new session." The inline comment above the branch restates this exactly — "a 404 in response to a request that carried an Mcp-Session-Id".
But the predicate this._sessionId !== undefined is evaluated after the response header handling that runs unconditionally at the top of _send's response processing:
const sessionId = response.headers.get('mcp-session-id');
if (sessionId) {
this._sessionId = sessionId; // mutation happens here, even on !response.ok
}
if (!response.ok) {
// ...401/403 handling...
if (response.status === 404 && this._sessionId !== undefined) { // reads post-mutation state
this._sessionId = undefined;
throw new SdkError(SdkErrorCode.ClientHttpSessionExpired, ...);
}
}So the check answers "is a session ID set now, after the response has been processed?" — not "did the request we just sent carry one?". Those two questions diverge in two ways.
How they diverge
(1) A 404 response that itself carries an mcp-session-id header. A request goes out with no session ID; the server responds 404 and includes mcp-session-id in the response headers. Lines a few up store it into this._sessionId; the 404 check then sees a non-undefined session ID, misclassifies the error as ClientHttpSessionExpired, and immediately clears the session ID it just stored. Per the spec this should be ClientHttpNotImplemented (the request had no session). The PR's negative test (treats a 404 without a session ID as a generic HTTP error) only mocks headers: new Headers(), so this path is untested.
(2) Concurrency. Two _send calls in flight: request A is an initialize whose 200 response sets this._sessionId; request B was dispatched earlier with no session ID and gets a 404. If B's response is processed after A's, the 404 check reads A's freshly-established session ID, misclassifies B's 404, and clears the session A just created. The session is wedged until the caller reconnects.
Step-by-step proof (scenario 1)
- A fresh transport (no
sessionIdoption) callssend(initialize)._commonHeaders()returns headers withoutmcp-session-idbecausethis._sessionIdisundefined. - The (misbehaving) server responds
404 Not Foundwith headermcp-session-id: abc. response.headers.get('mcp-session-id')→'abc'→this._sessionId = 'abc'.!response.ok→ enter error block. The 404 check seesthis._sessionId === 'abc' !== undefined→ true.- Throws
SdkError(ClientHttpSessionExpired)and clearsthis._sessionId. Per the spec this is wrong — the request had noMcp-Session-Id, so it should be a genericClientHttpNotImplemented.
Why this is low-severity
Both triggers are unlikely in practice: real servers don't set mcp-session-id on 404 error responses, and the Protocol layer serializes initialize so the concurrency window is narrow (it requires direct transport.send() use outside Client.connect()). I'm filing this as a nit — a spec-precision issue with a one-line fix, not a likely real-world failure.
Fix
Snapshot the request-side state before the fetch and branch on that:
const hadSessionId = this._sessionId !== undefined; // before await fetch
// ...
if (response.status === 404 && hadSessionId) { ... }Or equivalently, check headers.has('mcp-session-id') on the already-built request headers.
The mirrored hunk in _startOrAuthSse happens to read correctly today (the GET path never writes the response mcp-session-id header back into this._sessionId), but using the same snapshot pattern there keeps the two branches symmetric and future-proofs against someone adding session-header handling to the GET path.
| // Per the MCP spec (Streamable HTTP, Session Management): a 404 in | ||
| // response to a request that carried an `Mcp-Session-Id` means the | ||
| // session has expired or been terminated server-side, and the client | ||
| // must start a new session. Detect this by the status code alone — | ||
| // not the response body — since non-reference servers report it with | ||
| // varying bodies (different JSON-RPC error codes, plain text, HTML). | ||
| // Clear the dead session ID so a subsequent reconnect issues a fresh | ||
| // `initialize`, and surface a distinct, body-agnostic error code. | ||
| if (response.status === 404 && this._sessionId !== undefined) { | ||
| this._sessionId = undefined; | ||
| throw new SdkError(SdkErrorCode.ClientHttpSessionExpired, `Session expired (HTTP 404): ${text}`, { | ||
| status: 404, | ||
| text | ||
| }); | ||
| } | ||
|
|
There was a problem hiding this comment.
🟡 The PR adds 404-with-session-ID handling to the POST (_send) and standalone GET SSE (_startOrAuthSse) paths, but terminateSession() — which always sends a DELETE carrying Mcp-Session-Id since it returns early when there is no session ID — still treats a 404 as a generic failure: it throws ClientHttpFailedToTerminateSession before reaching this._sessionId = undefined, so the dead session ID is retained and the caller gets a misleading error even though the session is in fact already gone. Per the same spec rule applied on the other two paths, a 404 here should clear _sessionId and likely succeed silently (the session is already terminated server-side, which is what the caller asked for).
Extended reasoning...
What the bug is
This PR introduces the rule "a 404 in response to a request carrying Mcp-Session-Id means the session expired server-side" and applies it to two of the three code paths in StreamableHTTPClientTransport that can carry a session ID: _send (POST) and _startOrAuthSse (GET). The third path, terminateSession(), also always carries a session ID — it returns early at the top of the method when !this._sessionId — but is left untouched.
The code path that triggers it
async terminateSession(): Promise<void> {
if (!this._sessionId) {
return; // No session to terminate
}
// ... DELETE with Mcp-Session-Id header via _commonHeaders() ...
if (!response.ok && response.status !== 405) {
throw new SdkError(SdkErrorCode.ClientHttpFailedToTerminateSession, ...);
}
this._sessionId = undefined; // never reached on 404
}On a 404 the method falls into the !response.ok && status !== 405 branch and throws, so this._sessionId = undefined on the next line never executes.
Step-by-step proof
- Client establishes a session;
_sessionId = 'abc'. - Server-side, the session expires (idle timeout, restart, etc.).
- Client calls
transport.terminateSession(). - The early-return guard passes (
_sessionIdis set), so a DELETE is sent withMcp-Session-Id: abc. - Server replies
404 Not Foundbecause it has no record of sessionabc. response.okisfalse,response.status !== 405, soClientHttpFailedToTerminateSessionis thrown.this._sessionIdis still'abc'— the dead ID survives, and the caller is told termination failed even though the session is already gone (which is exactly the outcome the caller wanted).
Why this is the same wedge the PR is fixing
The PR description frames the motivation as "the session stays wedged until the app is restarted" because consumers couldn't distinguish session expiry from other HTTP failures. terminateSession() exhibits the identical symptom: after step 7 the transport still holds a stale session ID, and the surfaced error code (ClientHttpFailedToTerminateSession) is misleading — it implies an actionable failure when in fact nothing went wrong.
Why the impact is bounded (and why this is a nit rather than a blocker)
terminateSession()is typically called on shutdown, so the lingering_sessionIdis usually about to be discarded anyway.- With this PR's other changes in place, if the caller does keep the transport alive and issues a subsequent POST, the new
_send404 handler will now clear the stale ID — so the wedge is no longer permanent the way it was before this PR.
That said, the inconsistency is real and the misleading error is user-visible: callers that surface or log termination failures will report a spurious error.
How to fix it
Treat 404 like the existing 405 special-case: when the DELETE returns 404, clear _sessionId and return successfully (the user's intent — terminating the session — has already been achieved server-side). Alternatively, throw ClientHttpSessionExpired after clearing _sessionId for consistency with the other two paths, but silent success is closer to the spec semantics for DELETE. Either way, _sessionId must be cleared.
// Option A: silent success (preferred — spec considers the session already terminated)
if (!response.ok && response.status !== 405 && response.status !== 404) {
throw new SdkError(SdkErrorCode.ClientHttpFailedToTerminateSession, ...);
}
this._sessionId = undefined;| describe('session expiry (HTTP 404)', () => { | ||
| const message: JSONRPCMessage = { | ||
| jsonrpc: '2.0', | ||
| method: 'test', | ||
| params: {}, | ||
| id: 'test-id' | ||
| }; | ||
|
|
||
| (globalThis.fetch as Mock).mockResolvedValueOnce({ | ||
| ok: false, | ||
| status: 404, | ||
| statusText: 'Not Found', | ||
| text: () => Promise.resolve('Session not found'), | ||
| headers: new Headers() | ||
| }); | ||
| // Per the MCP spec (Streamable HTTP, Session Management): a 404 in response | ||
| // to a request that carried an `Mcp-Session-Id` means the session expired and | ||
| // a new one must be started. Detection is by status code alone — non-reference | ||
| // servers report it with varying bodies, so we must not require a body shape. | ||
| it.each([ | ||
| ['plain text', 'Session not found'], | ||
| ['JSON-RPC -32002 (Figma Desktop)', '{"jsonrpc":"2.0","error":{"code":-32002,"message":"Session not found"},"id":null}'], | ||
| ['JSON-RPC -32001 (reference server)', '{"jsonrpc":"2.0","error":{"code":-32001,"message":"Session not found"},"id":null}'], | ||
| ['arbitrary HTML', '<html><body>404 page not found</body></html>'], | ||
| ['empty body', ''] |
There was a problem hiding this comment.
🟡 The new session expiry (HTTP 404) test group only exercises the POST path (_send via sessionTransport.send()); the GET SSE path (_startOrAuthSse), which this PR also modified to handle 404-with-session-ID and which constructs a different SdkError data shape ({ status, statusText } vs { status, text }), has no test. Consider adding a GET-path test (e.g. transport.start() + _startOrAuthSse({}) with a sessionId set and a 404 mock) mirroring the POST cases.
Extended reasoning...
Test-coverage gap: GET SSE (_startOrAuthSse) 404-with-session-ID path is untested
This PR adds the "404 with a session ID means session expiry" handling to two code paths in packages/client/src/client/streamableHttp.ts:
_send(POST) — around line 649: throwsSdkError(SdkErrorCode.ClientHttpSessionExpired,Session expired (HTTP 404): ${text}, { status: 404, text })_startOrAuthSse(standalone GET SSE) — around line 291: throwsSdkError(SdkErrorCode.ClientHttpSessionExpired, 'Failed to open SSE stream: session expired (HTTP 404)', { status: 404, statusText: response.statusText })
These are distinct branches with different error messages and different data shapes ({ status, text } vs { status, statusText }). The new describe('session expiry (HTTP 404)') block in packages/client/test/client/streamableHttp.test.ts only exercises path (1):
- The parametrized
it.each(...)test callssessionTransport.send(message), which routes through_send(POST). - The negative case ("404 without a session ID") also calls
transport.send(message)(POST).
Nothing in the test file ever invokes _startOrAuthSse (or transport.start() + a GET SSE attempt) with a session ID set and a mocked 404 response, so the GET-side branch is unverified.
Why this matters
The repo's review checklist requires that "new behavior has vitest coverage including error paths." Because the GET branch is a separate if with its own message and a divergent data shape, a future regression there (e.g. someone removing the GET-side branch during a refactor, or changing the data shape) would not be caught by the existing tests — they'd all stay green.
Concrete walk-through
- Construct a
StreamableHTTPClientTransportwith{ sessionId: 'existing-session-id' }. - Call
await transport.start()thenawait (transport as any)._startOrAuthSse({})(the test file already uses this private-method-access pattern for several other GET-path tests). - Mock
fetchto resolve with{ ok: false, status: 404, statusText: 'Not Found', text: () => Promise.resolve(''), headers: new Headers() }. - Currently no test asserts that the rejection is an
SdkErrorwithcode === SdkErrorCode.ClientHttpSessionExpired, thaterror.dataequals{ status: 404, statusText: 'Not Found' }, or thattransport.sessionIdis cleared afterward.
Suggested fix
Add a GET-path test inside the existing describe('session expiry (HTTP 404)') group, something like:
it('treats 404 on the standalone GET SSE stream with a session ID as session expiry', async () => {
const sessionTransport = new StreamableHTTPClientTransport(new URL('http://localhost:1234/mcp'), {
sessionId: 'existing-session-id'
});
(globalThis.fetch as Mock).mockResolvedValueOnce({
ok: false,
status: 404,
statusText: 'Not Found',
text: () => Promise.resolve(''),
headers: new Headers()
});
await sessionTransport.start();
const error = await (sessionTransport as unknown as { _startOrAuthSse: (o: StartSSEOptions) => Promise<void> })
._startOrAuthSse({})
.then(() => null, e => e);
expect(error).toBeInstanceOf(SdkError);
expect((error as SdkError).code).toBe(SdkErrorCode.ClientHttpSessionExpired);
expect((error as SdkError).data).toEqual({ status: 404, statusText: 'Not Found' });
expect(sessionTransport.sessionId).toBeUndefined();
await sessionTransport.close().catch(() => {});
});A matching negative case (GET 404 with no session ID still produces ClientHttpFailedToOpenStream) would round out the symmetry with the POST tests.
This is a test-coverage gap, not a runtime defect — the production GET-path code looks correct as written.
Motivation
The MCP spec (Streamable HTTP, Session Management) states:
StreamableHTTPClientTransportdid not implement this. Every non-401/403 error status — including404— was surfaced as a genericSdkErrorCode.ClientHttpNotImplemented(POST) orSdkErrorCode.ClientHttpFailedToOpenStream(GET), with no way for a caller to tell session expiry apart from any other HTTP failure.In practice consumers worked around this by string-matching the response body (e.g. looking for a
-32001JSON-RPC code). That only matches the reference server. Servers that report an expired session with a different body slip through and the session stays wedged until the app is restarted, for example:-32002)Change
Detect session expiry by status code alone, scoped to requests that actually carried a session ID (exactly what the spec describes):
404whenthis._sessionIdis set, the transport clears the stale session ID and throwsSdkErrorwith a newSdkErrorCode.ClientHttpSessionExpired. Clearing the ID means a subsequentclient.connect(transport)issues a freshinitializeinstead of a reconnect (Client.connect()branches ontransport.sessionId).404for a request that did not carry a session ID (e.g. a wrong URL on the initial connect) is unchanged and still surfaces asClientHttpNotImplemented.Applied to both the POST (
_send) and standalone GET SSE (_startOrAuthSse) paths.Tests
packages/client/test/client/streamableHttp.test.ts— replaced the single 404 test with asession expiry (HTTP 404)group:ClientHttpSessionExpiredfor a 404 with a session ID across varied bodies (plain text,-32002,-32001, HTML, empty), and assertingtransport.sessionIdis clearedClientHttpNotImplementedFull workspace typecheck, build, and lint pass (pre-push hook).
Docs
Added
SdkErrorCode.ClientHttpSessionExpiredand the new behavior todocs/migration.mdanddocs/migration-SKILL.md.